Join our client, a premier high-frequency trading firm with global influence, at their Sydney office, which is designed to be a high-speed APAC hub for tech-driven trading. Our client owns mission-critical systems, moves fast, and continuously delivers. Their work is valued, visible, and essential to business success.
What You’ll Do:
As a Cloud Infrastructure & Reliability Innovator, you’ll play a key role in designing, building, and maintaining the systems that power our critical applications. You’ll collaborate with developers and IT teams to ensure the scalability, reliability, and performance of our infrastructure, using automation and observability to stay ahead of incidents.
Your Responsibilities Will Include:
• Designing and implementing scalable, highly available, and fault-tolerant systems in the cloud
• Developing infrastructure as code using tools like Terraform, Ansible, or Pulumi
• Building and maintaining CI/CD pipelines for seamless deployments
• Monitoring application performance and proactively resolving issues
• Managing incident response and postmortems to continuously improve system resilience
• Automating operational tasks to eliminate toil and improve efficiency
• Working cross-functionally with developers, QA, and DevOps to deliver top-tier software
• Participating in on-call rotations and helping improve alerting and incident systems
• Ensuring systems meet security and compliance standards
Why Join Us:
• Work on mission-critical infrastructure in a fast-paced, forward-thinking environment
• Collaborate with world-class engineers passionate about reliability and automation
• Opportunities for continuous learning and certification
• Flexible working, including remote options
• Competitive salary, bonus structure, and benefits package
About You:
• Proven experience in SRE, DevOps, or infrastructure engineering roles
• Proficiency with cloud platforms (AWS, GCP, or Azure)
• Strong coding/scripting skills in Python, Go, Bash, or similar
• Familiarity with containers and container orchestration (Docker, Kubernetes)
• Deep understanding of monitoring and observability tools (Prometheus, Grafana, ELK, etc.)
• Passion for system reliability, performance, and automation
• Excellent problem-solving and communication skills
Join us in redefining how reliable, scalable, and high-performance systems are built. If you're driven by solving complex infrastructure challenges with clean, automated solutions, we want to hear from you.
#SiteReliabilityEngineer #SRE #DevOps #CloudInfrastructure #InfrastructureAsCode #Kubernetes #Automation #CI_CD #Monitoring #Observability #Terraform #Python #AWS #SystemReliability #EngineeringJobs